AITopics | synthetic 0

HCT-QA: A Benchmark for Question Answering on Human-Centric Tables

Ahmad, Mohammad S., Naeem, Zan A., Aupetit, Michaël, Elmagarmid, Ahmed, Eltabakh, Mohamed, Ma, Xiasong, Ouzzani, Mourad, Ruan, Chaoyi

arXiv.org Artificial Intelligence, Nov-4-2025

Tabular data embedded within PDF files, web pages, and other document formats are prevalent across numerous sectors such as government, engineering, science, and business. These human-centric tables (HCTs) possess a unique combination of high business value, intricate layouts, limited operational power at scale, and sometimes serve as the only data source for critical insights. However, their complexity poses significant challenges to traditional data extraction, processing, and querying methods. While current solutions focus on transforming these tables into relational formats for SQL queries, they fall short in handling the diverse and complex layouts of HCTs and hence in making them amenable to querying. This paper describes HCT-QA, an extensive benchmark of HCTs, natural language queries, and related answers on thousands of tables. Our dataset includes 2,188 real-world HCTs with 9,835 QA pairs and 4,679 synthetic tables with 67.5K QA pairs. While HCTs can potentially be processed by different types of query engines, in this paper we focus on Large Language Models as potential engines and assess their ability to process and query such tables.

large language model, machine learning, question answering, (22 more...)

arXiv:2504.20047

Country:

North America > United States (0.46)
Asia > Pakistan (0.28)
North America > Canada (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Information Technology (0.92)
Government > Regional Government (0.46)
Transportation > Passenger (0.46)
Transportation > Air (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

WebInject: Prompt Injection Attack to Web Agents

Wang, Xilong, Bloch, John, Shao, Zedian, Hu, Yuepeng, Zhou, Shuyan, Gong, Neil Zhenqiang

arXiv.org Artificial Intelligence, Oct-20-2025

Multi-modal large language model (MLLM)-based web agents interact with webpage environments by generating actions based on screenshots of the webpages. In this work, we propose WebInject, a prompt injection attack that manipulates the webpage environment to induce a web agent to perform an attacker-specified action. Our attack adds a perturbation to the raw pixel values of the rendered webpage. After these perturbed pixels are mapped into a screenshot, the perturbation induces the web agent to perform the attacker-specified action. We formulate the task of finding the perturbation as an optimization problem. A key challenge in solving this problem is that the mapping between raw pixel values and screenshot is non-differentiable, making it difficult to backpropagate gradients to the perturbation. To overcome this, we train a neural network to approximate the mapping and apply projected gradient descent to solve the reformulated optimization problem. Extensive evaluation on multiple datasets shows that WebInject is highly effective and significantly outperforms baselines.
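The optimization step in the abstract above can be illustrated with a minimal sketch of projected gradient descent (PGD). It assumes a toy linear surrogate with a squared-error loss toward a target score and an L-infinity budget `eps`. The abstract does not state the norm, the loss, or the surrogate architecture, so `surrogate_grad`, `pgd_perturbation`, and every numeric value here are hypothetical stand-ins, not the authors' method.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def project_linf(delta, eps):
    """Project delta onto the L-infinity ball of radius eps by clipping."""
    return [max(-eps, min(eps, d)) for d in delta]


def pgd_perturbation(grad_fn, x, eps, step, iters):
    """Signed-gradient descent on the loss, projecting after every step."""
    delta = [0.0] * len(x)
    for _ in range(iters):
        perturbed = [xi + di for xi, di in zip(x, delta)]
        grad = grad_fn(perturbed)
        # Move against the gradient to reduce the attacker's loss.
        delta = [di - step * sign(gi) for di, gi in zip(delta, grad)]
        delta = project_linf(delta, eps)
    return delta


# Hypothetical differentiable surrogate: a linear score with a squared-error
# loss toward an attacker-chosen target score (assumption for illustration).
WEIGHTS = [1.0, -2.0, 0.5]
TARGET = 3.0


def surrogate_loss(x):
    score = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return (score - TARGET) ** 2


def surrogate_grad(x):
    score = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return [2.0 * (score - TARGET) * w for w in WEIGHTS]


x0 = [0.2, 0.5, 0.7]  # stand-in for raw pixel values
delta = pgd_perturbation(surrogate_grad, x0, eps=0.1, step=0.02, iters=10)
x_adv = [xi + di for xi, di in zip(x0, delta)]
```

The projection keeps every component of `delta` within the budget, so the perturbation stays bounded however many steps are taken. A real attack would also clip `x_adv` to the valid pixel range.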

large language model, machine learning, natural language, (19 more...)

arXiv:2505.11717

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Quantum SMOTE with Angular Outliers: Redefining Minority Class Handling

Mohanty, Nishikanta, Behera, Bikash K., Ferrie, Christopher

arXiv.org Artificial Intelligence, Jan-31-2025

This paper introduces Quantum-SMOTEV2, an advanced variant of the Quantum-SMOTE method, leveraging quantum computing to address class imbalance in machine learning datasets without K-Means clustering. Quantum-SMOTEV2 synthesizes data samples using swap tests and quantum rotation centered around a single data centroid, concentrating on the angular distribution of minority data points and the concept of angular outliers (AOL). Experimental results show significant enhancements in model performance metrics at moderate SMOTE levels (30-36%), which previously required up to 50% with the original method. Quantum-SMOTEV2 maintains essential features of its predecessor (arXiv:2402.17398), such as rotation angle, minority percentage, and splitting factor, allowing for tailored adaptation to specific dataset needs. The method is scalable, utilizing compact swap tests and low-depth quantum circuits to accommodate a large number of features. Evaluation on the public Cell-to-Cell Telecom dataset with Random Forest (RF), K-Nearest Neighbours (KNN) Classifier, and Neural Network (NN) illustrates that integrating Angular Outliers modestly boosts classification metrics like accuracy, F1 Score, AUC-ROC, and AUC-PR across different proportions of synthetic data, highlighting the effectiveness of Quantum-SMOTEV2 in enhancing model performance for edge cases.
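For readers unfamiliar with the swap test mentioned above, the following sketch computes its ideal outcome statistic classically. For two normalized states, an ideal swap test measures the ancilla in |0> with probability 1/2 + 1/2 |<a|b>|^2. This is a classical calculation for real-valued amplitude-encoded vectors, not the circuit from the paper; the function names and the use of a centroid vector are assumptions for illustration.

```python
import math


def normalize(v):
    """Scale a real vector to unit length (amplitude encoding)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def swap_test_p0(a, b):
    """Ideal probability of measuring the ancilla in |0> after a swap test."""
    a, b = normalize(a), normalize(b)
    overlap = sum(x * y for x, y in zip(a, b))
    return 0.5 + 0.5 * overlap ** 2


def angle_from_p0(p0):
    """Angle in [0, pi/2] implied by P(0); the sign of the overlap is lost."""
    overlap_sq = max(0.0, min(1.0, 2.0 * p0 - 1.0))
    return math.acos(math.sqrt(overlap_sq))


centroid = [1.0, 0.0]        # hypothetical minority-class centroid
point = [1.0, 1.0]           # hypothetical minority data point
p0 = swap_test_p0(centroid, point)
theta = angle_from_p0(p0)    # angular separation from the centroid
```

On hardware, `p0` would be estimated from repeated measurements, so the recovered angle carries sampling noise that this sketch ignores.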

artificial intelligence, machine learning, minority, (16 more...)

arXiv:2501.19001

Country:

Asia > Singapore (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
Oceania > Australia (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Graph Vertex Embeddings: Distance, Regularization and Community Detection

Nowak, Radosław, Małkowski, Adam, Cieślak, Daniel, Sokół, Piotr, Wawrzyński, Paweł

arXiv.org Artificial Intelligence, Apr-9-2024

Graph embeddings have emerged as a powerful tool for representing complex network structures in a low-dimensional space, enabling the use of efficient methods that employ the metric structure in the embedding space as a proxy for the topological structure of the data. In this paper, we explore several aspects that affect the quality of a vertex embedding of graph-structured data. To this effect, we first present a family of flexible distance functions that faithfully capture the topological distance between different vertices. Secondly, we analyze vertex embeddings as resulting from a fitted transformation of the distance matrix rather than as a direct result of optimization. Finally, we evaluate the effectiveness of our proposed embedding constructions by performing community detection on a host of benchmark datasets. The reported results are competitive with classical algorithms that operate on the entire graph while benefitting from a substantially reduced computational complexity due to the reduced dimensionality of the representations.
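As a point of reference for the "topological distance" that the abstract's distance functions aim to capture, the sketch below computes the plain hop-count (shortest-path) distance matrix of an unweighted graph with breadth-first search. This is the simplest baseline notion of graph distance, not the family of distance functions proposed in the paper; `hop_distance_matrix` and the example graph are hypothetical.

```python
from collections import deque


def hop_distance_matrix(adjacency):
    """All-pairs hop distances for an unweighted graph.

    adjacency[u] is the list of neighbours of vertex u.
    Unreachable pairs keep the value float("inf").
    """
    n = len(adjacency)
    dist = [[float("inf")] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[source][v] == float("inf"):
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist


# Hypothetical example: a path graph 0 - 1 - 2 - 3.
path_graph = [[1], [0, 2], [1, 3], [2]]
distances = hop_distance_matrix(path_graph)
```

A matrix like `distances` is the kind of input that a fitted transformation (for example, a multidimensional-scaling-style embedding) would map to low-dimensional vertex coordinates.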

affinitypropagation 2 0, graph, vertex, (17 more...)

arXiv:2404.10784

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Asia > Singapore (0.04)
Asia > Nepal (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Real, fake and synthetic faces -- does the coin have three sides?

Naeem, Shahzeb, Al-Sharawi, Ramzi, Khan, Muhammad Riyyan, Tariq, Usman, Dhall, Abhinav, Al-Nashash, Hasan

arXiv.org Artificial Intelligence, Apr-2-2024

With the ever-growing power of generative artificial intelligence, deepfake and artificially generated (synthetic) media have continued to spread online, which creates various ethical and moral concerns regarding their usage. To address this, we present a novel exploration of the trends and patterns observed in real, deepfake and synthetic facial images. The proposed analysis is done in two parts: firstly, we incorporate eight deep learning models and analyze their performances in distinguishing between the three classes of images. Next, we delve further into the similarities and differences between these three sets of images by investigating their image properties both in the context of the entire image and in the context of specific regions within the image. An ANOVA test was also performed and provided further clarity on the patterns associated with the images of the three classes. From our findings, we observe that the investigated deep learning models found it easier to detect synthetic facial images, with the ViT Patch-16 model performing best on this task with a class-averaged sensitivity, specificity, precision, and accuracy of 97.37%, 98.69%, 97.48%, and 98.25%, respectively. This observation was supported by further analysis of various image properties. We saw noticeable differences across the three categories of images. This analysis can help us build better algorithms for facial image generation, and also shows that synthetic, deepfake and real face images are indeed three different classes.
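To make the class-averaged metrics quoted above concrete, here is a minimal sketch that derives them from a three-class confusion matrix. It assumes one-vs-rest counting per class followed by an unweighted (macro) average; the abstract does not say exactly how the averages were computed, and the confusion matrix below is invented for illustration, not taken from the paper.

```python
def class_averaged_metrics(confusion):
    """Macro-averaged one-vs-rest metrics.

    confusion[t][p] counts samples of true class t predicted as class p.
    Assumes every class has at least one true and one predicted sample.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[r][k] for r in range(n)) - tp
        tn = total - tp - fn - fp
        per_class.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "precision": tp / (tp + fp),
            "accuracy": (tp + tn) / total,
        })
    # Unweighted mean over classes for each metric.
    return {name: sum(m[name] for m in per_class) / n for name in per_class[0]}


# Hypothetical counts; rows and columns ordered real, deepfake, synthetic.
example_confusion = [
    [9, 1, 0],
    [1, 8, 1],
    [0, 0, 10],
]
averaged = class_averaged_metrics(example_confusion)
```

Under one-vs-rest counting, the averaged accuracy can exceed the averaged sensitivity because true negatives from the other two classes count toward each class's accuracy, which matches the pattern in the reported figures.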

arxiv, synthetic 0, synthetic image, (14 more...)

arXiv:2404.01878

Country:

Asia > Middle East > UAE > Sharjah Emirate > Sharjah (0.04)
Oceania > Australia > South Australia (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.98)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Analyzing Effects of Fake Training Data on the Performance of Deep Learning Systems

Seth, Pratinav, Bhandari, Akshat, Lakara, Kumud

arXiv.org Artificial Intelligence, Mar-2-2023

Deep learning models frequently suffer from various problems such as class imbalance and lack of robustness to distribution shift. It is often difficult to find data suitable for training beyond the available benchmarks. This is especially the case for computer vision models. However, with the advent of Generative Adversarial Networks (GANs), it is now possible to generate high-quality synthetic data. This synthetic data can be used to alleviate some of the challenges faced by deep learning models. In this work we present a detailed analysis of the effect of training computer vision models using different proportions of synthetic data along with real (organic) data. We analyze the effect that various quantities of synthetic data, when mixed with original data, can have on a model's robustness to out-of-distribution data and the general quality of predictions.
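The idea of training on different proportions of synthetic and real data can be sketched as a small sampling helper. The abstract does not describe the mixing protocol, so the fixed total size, the sampling without replacement, and the name `mix_real_and_synthetic` are all assumptions made for illustration.

```python
import random


def mix_real_and_synthetic(real, synthetic, synthetic_fraction, total, seed=0):
    """Build a training set of size `total` with the given synthetic share."""
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must lie in [0, 1]")
    n_synthetic = round(total * synthetic_fraction)
    n_real = total - n_synthetic
    if n_synthetic > len(synthetic) or n_real > len(real):
        raise ValueError("not enough samples for the requested mix")
    rng = random.Random(seed)  # seeded for reproducible experiments
    mixed = rng.sample(real, n_real) + rng.sample(synthetic, n_synthetic)
    rng.shuffle(mixed)
    return mixed


# Hypothetical placeholder datasets: (source, index) pairs instead of images.
real_data = [("real", i) for i in range(100)]
synthetic_data = [("synthetic", i) for i in range(100)]
train_set = mix_real_and_synthetic(real_data, synthetic_data, 0.25, total=40)
```

Holding `total` fixed while sweeping `synthetic_fraction` separates the effect of the data source from the effect of dataset size; adding synthetic samples on top of a fixed real set would be an equally plausible design.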

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv:2303.01268

Country: Asia > India (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
